I use a patched version of Aram Kocharyan’s Crayon Syntax Highlighter as the syntax highlighter plugin for WordPress (currently version 5.3.2, running on PHP 7.3.14). I wanted the plugin to highlight PowerShell Core scripts in the same way that the PSReadLine module does.
Configuring the theme
I created a new Crayon theme, Solarized Dark PS, that maps the PSReadLine colours to the plugin elements as follows:
| PSReadLine Color option | Plugin element | 
|---|---|
| Comment | COMMENT | 
| Keyword | KEYWORD | 
| Command and Member | STATEMENT | 
| Parameter | RESERVED | 
| Operator | OPERATOR | 
| DefaultToken | IDENTIFIER and Unhighlighted | 
| Type | TYPE | 
| Number | CONSTANT | 
| String | STRING | 
| Variable | VARIABLE | 
Configuring the language
The language grammar for PowerShell supplied with Crayon was as follows (for those elements which reference the default grammar, I have added the default as a following comment):

    COMMENT             ((?<!`)#.*?$)|((?<!`)<#.*?(?<!`)#>)
    HERESTRING:STRING   ((?<!`)(@\".*?^\s*\"@))|((?<!`)(@\'.*?^\s*\'@))
    STRING              ((?<!`)".*?(?<!`)")|((?<!`)'.*?')
    FUNCTIONS:RESERVED  (\b(?alt:reserved.txt)\b)|((?-i)[A-Z]\w+-[A-Z]\w+(?i))
    STATEMENT           \b(?alt:statement.txt)\b
    TYPE                \b(?alt:type.txt)\b
    ENTITY              (?default)
    # ENTITY              (\b[a-z_]\w*\b(?=\s*\([^\)]*\)))|((?<!\.)(\b[a-z_]\w*\b)(?=[^}=|,.:;"'\)]*{))|(\b[a-z_]\w+\b\s+(?=\b[a-z_]\w+\b))
    VARIABLE            \$[A-Za-z_]\w*\b
    IDENTIFIER          (?default)
    # IDENTIFIER          \b[A-Za-z_]\w*\b
    CONSTANT            -\w+\b
    OPERATOR            (?default)
    # OPERATOR            (?alt:operator.txt)
    SYMBOL              (?default)
    # SYMBOL              &[^;]+;|(?alt:symbol.txt)

Crayon’s language grammar is limited in what it can express. The regular expressions for each element are combined to form a single regular expression with alternative patterns:
(?:(regex1)|(regex2)| ... |(regexn))
That means that the code to be highlighted can only be analysed into mutually exclusive captures: each piece of text is assigned to at most one element. PHP 7.3 uses Perl Compatible Regular Expressions 2 (PCRE2), in which each alternative of a lookbehind assertion must match a string of fixed length. That means the ability to detect context is limited.
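As a rough illustration of the combined pattern, the sketch below joins a few simplified element patterns into one alternation and reports which element captured each match. It uses Python's `re` module as a stand-in for PCRE; the element patterns and the `tokenize` helper are hypothetical and are not Crayon's actual code.

```python
import re

# Hypothetical, simplified element patterns in priority order
# (illustrative only, not the patterns Crayon ships).
ELEMENTS = [
    ("COMMENT", r"#.*"),
    ("STRING", r'"[^"]*"'),
    ("VARIABLE", r"\$\w+"),
    ("IDENTIFIER", r"[A-Za-z_][\w-]*"),
]

# Build (?P<COMMENT>...)|(?P<STRING>...)|... as one regular expression.
COMBINED = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in ELEMENTS))

def tokenize(code):
    """Return (element, text) pairs; each match belongs to exactly one element."""
    return [(m.lastgroup, m.group()) for m in COMBINED.finditer(code)]

tokens = tokenize('Write-Host "$name" # greet')
for element, text in tokens:
    print(element, repr(text))
```

Here `$name` sits inside the string, so it is consumed as part of the STRING capture and is never offered to the VARIABLE pattern.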
I considered the supplied grammar to be lacking in certain respects, so I replaced it with the following:

    COMMENT               <(?<!`)#[\s\S]*?(?<!`)#>|(?<!`)(#.*?$)
    HERESTRING:STRING     (?<!`)(@\"[\s\S]*?^\s*\"@)|(?<!`)(@\'[\s\S]*?^\s*\'@)
    STRING                (?<!`)".*?(?<!`)"|(?<!`)'.*?'
    VARIABLE              \$([$?^]|(\w[\w?]*:)?\w[\w?]*)|@(\w[\w?]*:)?\w[\w?]*
    WORDOPERATOR:OPERATOR (?<![\w?-])-(?alt:word-operator.txt)(?![\w?-])
    RESERVED              (?<=\s)-[A-Z_?][\w?-]+:*?
    KEYWORD               \b(?alt:statement.txt)(?!-)\b
    FUNDEF:IDENTIFIER     (?<=[^-]function\s|[^-]filter\s)\s*[A-Z_][\w-]*
    MEMBER:STATEMENT      (?<=\.)[A-Z_][\w-]*
    TYPE                  \b(?alt:type.txt)\b
    STATEMENT             \b[A-Z_][\w-]*\b
    OPERATOR              (?alt:operator.txt)
    INT:CONSTANT          (0x[\dA-F]+|\d+)(ul|us|uy|[lnsuy])?([kmgtp]b)?
    CONSTANT              (\d*\.\d+(e[+\-]?\d+)?|\d+e[+\-]?\d+)d?([kmgtp]b)?
    ARITH:OPERATOR        [+-]

I match block comments (<# … #>) before end of line comments (# …), and allow the former to span multiple lines, by using [\s\S]*? rather than .*?.
I also allow here-strings (@" … "@ or @' … '@) to span multiple lines.
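The difference between `.*?` and `[\s\S]*?` can be seen in a small sketch. It uses Python's `re` module rather than PCRE, on the assumption that no "dot matches newline" option is in effect; in that case `.` stops at a line break in both engines. The sample text is made up, and the backtick lookbehinds are left out for brevity.

```python
import re

# A made-up two-line block comment followed by a command.
source = "<# first line\nsecond line #>\nGet-Date"

# '.' does not match a newline, so this form cannot span both comment lines.
dot_form = re.search(r"<#.*?#>", source)

# [\s\S] matches any character at all, newlines included.
any_char_form = re.search(r"<#[\s\S]*?#>", source)

print(dot_form)
print(repr(any_char_form.group()))
```

The same substitution is what lets the here-string patterns run from the opening `@"` or `@'` to a closing delimiter on a later line.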
I assume that with the exception of $$, $? and $^, variables have the form $(\w[\w?]*:)?(\w[\w?]*); commands, functions and members have the form ([A-Z_][\w]*)(-[A-Z_][\w]*)*; and parameters have the form -[A-Z_?][\w?-]+:*?. I also assume that parameters are preceded by a space.
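To check these assumed forms against a few samples, here is a minimal sketch using Python's `re` module as a stand-in for PCRE. The two patterns are copied from the grammar above; case-insensitive matching is an assumption on my part (the `(?-i)` and `(?i)` toggles in the supplied grammar suggest it is the default), and the sample names are invented.

```python
import re

# Patterns copied from the replacement grammar.
# Assumption: the highlighter matches case-insensitively.
VARIABLE = re.compile(
    r"\$([$?^]|(\w[\w?]*:)?\w[\w?]*)|@(\w[\w?]*:)?\w[\w?]*", re.IGNORECASE
)
PARAMETER = re.compile(r"(?<=\s)-[A-Z_?][\w?-]+:*?", re.IGNORECASE)

# Invented samples: the three special variables, plain and scoped names, a splat.
samples = ["$$", "$?", "$^", "$name", "$env:PATH", "@splat"]
variable_hits = [bool(VARIABLE.fullmatch(s)) for s in samples]

# A lone '$' or a bare word is not a variable under these assumptions.
non_variables = [VARIABLE.fullmatch(s) for s in ["$", "name"]]

# Only hyphens preceded by whitespace start a parameter,
# so the hyphen inside Get-ChildItem is ignored.
parameters = PARAMETER.findall(r"Get-ChildItem -Path C:\Temp -Recurse")

print(variable_hits)
print(parameters)
```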
After comments and strings, I first check for variables. I then check for operators that are words (for example, -eq), parameters, keywords, identifiers defined by function, members, types, commands and then other operators except + and -, which appear in certain literal real numbers.
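The order matters because a word operator such as `-gt` also fits the parameter pattern. The sketch below, again in Python's `re` as a stand-in for PCRE, combines just those two patterns in both orders. The `WORD_OPERATORS` list is a hypothetical stand-in for the contents of word-operator.txt, and `classify` is an illustrative helper, not part of Crayon.

```python
import re

# Assumption: a tiny stand-in for the contents of word-operator.txt.
WORD_OPERATORS = ["eq", "ne", "gt", "lt", "like", "match"]

PATTERNS = {
    "WORDOPERATOR": r"(?<![\w?-])-(?:" + "|".join(WORD_OPERATORS) + r")(?![\w?-])",
    "RESERVED": r"(?<=\s)-[A-Z_?][\w?-]+:*?",
}

def classify(code, order):
    """Combine the named patterns in the given order and label each match."""
    combined = re.compile(
        "|".join(f"(?P<{name}>{PATTERNS[name]})" for name in order),
        re.IGNORECASE,
    )
    return [(m.lastgroup, m.group()) for m in combined.finditer(code)]

code = "Get-Process -Name pwsh | Where-Object CPU -gt 10"

# Word operators checked first, as in the grammar above.
intended = classify(code, ["WORDOPERATOR", "RESERVED"])

# Parameters checked first: -gt is now swallowed by the parameter pattern.
swapped = classify(code, ["RESERVED", "WORDOPERATOR"])

print(intended)
print(swapped)
```

With the word operators first, `-gt` is labelled as an operator and `-Name` as a parameter; with the order swapped, both are labelled as parameters.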